Giveme5W: Main Event Retrieval from News Articles by Extraction of the Five Journalistic W Questions
Authors
Abstract
Extracting event descriptors from news articles is commonly required for tasks such as clustering related articles, summarization, and news aggregation. Due to the lack of generally usable and publicly available methods optimized for news, many researchers must redundantly implement such methods for their projects. Answers to the five journalistic W questions (5Ws) describe the main event of a news article, i.e., who did what, when, where, and why. The main contribution of this paper is Giveme5W, the first open-source, syntax-based 5W extraction system for news articles. The system retrieves an article’s main event by extracting phrases that answer the journalistic 5Ws. In an evaluation with three assessors and 60 articles, we find that the extraction precision of 5W phrases is p = 0.7.
Similar resources
Frame Labeling of Competing Narratives in Journalistic Translation
Studying translations during the time of conflict has gained currency in the recent decade in translation studies. One of the cases in which conflict manifests itself is in the way different countries choose to name an event or a geographical location, for example. This study set out to understand how translation of rival names and labeling was carried out in Iranian state-run news agencies. To...
From Academic to Journalistic Texts: A Qualitative Analysis of the Evaluative Language of Science
This study examined academic articles and journalistic reports in 5 disciplinary areas to explore how similar contents might attitudinally be realized in two different genres. To this end, 25 research articles and 210 news reports were carefully selected and underwent detailed discourse semantic and grammatical analyses with the purpose of identifying the evaluative linguistic patterns....
Rich Interfaces for Browsing News in Blog Posts
Semantic models of news can enable richer interfaces for end-users to learn the context of news events referenced in blog posts. We present Brussell, a system that uses content-specific models of news event situations to perform anticipatory information retrieval, organize extraction results and present a novel, structured interface for navigating among the events of a news situation. ...
Novel User Interfaces via Model-Mediated Information Retrieval
Using content-specific models to guide information retrieval can provide richer interfaces to end-users in both navigating news articles and learning the context of news events. We present Brussell, a system that uses semantic models of news event situations to perform anticipatory information retrieval, organize extraction results and present a novel interface for navigating among the mileston...
Event Tracking
This paper introduces Event Tracking, a new application of Information Retrieval technology with interesting research and evaluation questions. We describe the problem, a pilot corpus of news stories that was constructed for experimental studies, and a "rolling" evaluation strategy that uses different segments of the corpus for each query. As part of a preliminary evaluation on a small pilot stu...